Dataset characteristics:
- Observations: 103 day-pairs
- Deployments: 7
- Date range: 2023-11-19 to 2024-02-03
- Temporal structure: Consecutive days within deployments
I was motivated to do this analysis after a conversation with Jessica Griffiths, who was not completely convinced by the results of our 30-minute analysis. She cited observing butterflies relocating at Pismo to the leeward side of trees, which I have also observed. I think we should also take Leong’s observations in good faith that there is at least some anecdotal evidence of monarchs responding to wind. So I took a very similar approach to the lagged analysis we did before, but over days instead of 30 minutes. To calculate the change from the previous day (t-1) to the current day (t), I use the 95th percentile of the count for those days (more on that below).
Rather than analyzing counts at arbitrary time points, I use the 95th percentile of each day’s counts. This approach:
I examine previous-day conditions affecting next-day abundance because:
Dataset characteristics:
- Observations: 103 day-pairs
- Deployments: 7
- Date range: 2023-11-19 to 2024-02-03
- Temporal structure: Consecutive days within deployments
Response Variable: Square-root transformed difference in 95th percentile butterfly counts between consecutive days (butterfly_diff_95th_sqrt)
Selecting an appropriate response variable is critical for valid statistical inference. I systematically evaluated multiple candidate response variables and transformations to identify the distribution that best meets model assumptions.
I considered three measures of daily abundance change:
max_butterflies_t - max_butterflies_t_1butterflies_95th_percentile_t - butterflies_95th_percentile_t_1 (more robust to outliers)butterflies_top3_mean_t - butterflies_top3_mean_t_1For each candidate, I tested two transformations:
I assessed normality using skewness, kurtosis, and the Shapiro-Wilk and Anderson-Darling tests to evaluate symmetry, tail behavior, and overall distributional fit.
A composite normality score (0-1 scale) was calculated by weighting all criteria, with higher scores indicating better normality.
Figure. Normality assessment for candidate response variables and transformations. Each panel shows the distribution (histogram) with a normal curve overlay (red line). Rank indicates overall normality score, with Rank 1 being most normal. The normality score combines skewness, kurtosis, and formal normality tests.
Winner: butterfly_diff_95th_sqrt (square-root transformed difference in 95th percentile counts)
I considered numerous potential environmental predictors characterizing full-day weather patterns. All predictors represent weather conditions from the previous day (t-1), as those are the conditions the butterflies experienced. This analysis examines whether there is a durable change the following day (t). The tables below describe all variables initially considered for model inclusion, though not all were used (see correlation analysis below).
Similar to the 30-minute analysis, these metrics help control for differences in cluster sizes and follow the same structure as the response variables.
| Variable | Description | Units | Role |
|---|---|---|---|
| max_butterflies_t_1 | Maximum count observed on previous day | count | Baseline |
| butterflies_95th_percentile_t_1 | 95th percentile of counts on previous day (baseline population measure) | count | Baseline |
| butterflies_top3_mean_t_1 | Mean of top 3 counts on previous day | count | Baseline |
I only considered one sun exposure metric: the sum of all butterflies in direct sunlight. I could have tried the maximum, but the sum felt like a better proxy for the entire day.
| Variable | Description | Units | Role |
|---|---|---|---|
| sum_butterflies_direct_sun_t_1 | Sum of all butterflies observed in direct sunlight throughout previous day (proxy for sun exposure) | cumulative count | Predictor |
Because we are considering an entire day, I generated several reasonable temperature metrics, though not all were used in the final models.
| Variable | Description | Units |
|---|---|---|
| temp_max_t_1 | Maximum temperature during daylight hours (6am-6pm) | °C |
| temp_min_t_1 | Minimum temperature during daylight hours | °C |
| temp_mean_t_1 | Mean temperature during daylight hours | °C |
| temp_at_max_count_t_1 | Temperature at the time when maximum butterfly count was observed | °C |
| hours_above_15C_t_1 | Number of hours with temperature ≥15°C (activity threshold) | hours |
| degree_hours_above_15C_t_1 | Cumulative degree-hours above 15°C (thermal energy available for activity) | degree-hours |
To capture the full story of each day, I generated as many reasonable wind metrics as I could think of, though not all were used in the final models.
| Variable | Description | Units |
|---|---|---|
| wind_avg_sustained_t_1 | Mean sustained wind speed during daylight hours | m/s |
| wind_max_gust_t_1 | Maximum wind gust speed recorded (strongest wind event) | m/s |
| wind_gust_sum_t_1 | Sum of all gust speed measurements (cumulative wind exposure) | m/s |
| wind_gust_sum_above_2ms_t_1 | Sum of gust speeds when wind >2 m/s (threshold-weighted exposure) | m/s |
| wind_gust_hours_t_1 | Integral of gust speed over time (gust-hours of exposure) | gust-hours |
| wind_minutes_above_2ms_t_1 | Number of minutes with wind ≥2 m/s (duration above activity threshold) | minutes |
| wind_gust_sd_t_1 | Standard deviation of gust speeds (wind variability/gustiness) | m/s |
| wind_mode_gust_t_1 | Most frequent gust speed in 0.5 m/s bins (typical wind condition) | m/s |
This correlation matrix shows all potential fixed effects initially considered. I manually reviewed this table to make decisions about which variables to include in the model runs.
For previous-day counts, I simply followed the chosen response variable (95th percentile). Sun exposure had only one option.
For temperature, there were several highly correlated pairs. I ultimately decided on temp_min and temp_max as useful predictors given the thermoregulation of monarchs, and also temp_at_max_count because it was not particularly correlated with other variables and temp_mean was too similar to max temperature.
Wind metrics were surprisingly correlated. wind_gust_sd stood out as a variable that avoided the worst correlation with other predictors, but still had a correlation of 0.76 with wind_max_gust. I ultimately chose wind_max_gust because it reasonably captures all other wind metrics (except wind_mode_gust, which I don’t know how to interpret) and directly matches our previous analysis.
After correlation analysis, the following variables were selected for model testing to minimize multicollinearity while maintaining biological interpretability:
| Variable | Role | Description |
|---|---|---|
| butterfly_diff_95th_sqrt | Response | Square-root transformed change in 95th percentile count between consecutive days |
| butterflies_95th_percentile_t_1 | Baseline (B models only) | Previous day’s 95th percentile count (baseline population size) |
| temp_max_t_1 | Predictor | Previous day’s maximum temperature (°C) |
| temp_min_t_1 | Predictor | Previous day’s minimum temperature (°C) |
| temp_at_max_count_t_1 | Predictor | Temperature when peak count occurred on previous day (°C) |
| wind_max_gust_t_1 | Predictor | Previous day’s maximum wind gust speed (m/s) |
| sum_butterflies_direct_sun_t_1 | Predictor | Sum of butterflies in direct sun on previous day |
This correlation matrix shows only the variables ultimately included in model testing, representing a reduced set with lower multicollinearity:
| butterfly_diff_95th_sqrt | butterflies_95th_percentile_t_1 | temp_max_t_1 | temp_min_t_1 | temp_at_max_count_t_1 | wind_max_gust_t_1 | sum_butterflies_direct_sun_t_1 | |
|---|---|---|---|---|---|---|---|
| butterfly_diff_95th_sqrt | 1.000 | -0.389 | -0.112 | -0.042 | 0.145 | 0.193 | -0.072 |
| butterflies_95th_percentile_t_1 | -0.389 | 1.000 | -0.146 | -0.299 | -0.132 | -0.211 | 0.442 |
| temp_max_t_1 | -0.112 | -0.146 | 1.000 | 0.173 | 0.215 | -0.334 | 0.016 |
| temp_min_t_1 | -0.042 | -0.299 | 0.173 | 1.000 | 0.351 | 0.210 | -0.331 |
| temp_at_max_count_t_1 | 0.145 | -0.132 | 0.215 | 0.351 | 1.000 | -0.116 | 0.098 |
| wind_max_gust_t_1 | 0.193 | -0.211 | -0.334 | 0.210 | -0.116 | 1.000 | -0.122 |
| sum_butterflies_direct_sun_t_1 | -0.072 | 0.442 | 0.016 | -0.331 | 0.098 | -0.122 | 1.000 |
I tested 105 models in two distinct families to distinguish different mechanisms by which environment affects abundance:
M Models (n=50): Absolute Effects
B Models (n=55): Proportional/Density-Dependent Effects
butterflies_95th_percentile_t_1 as covariate or interaction termWithin each family, models varied in:
deployment_idday_sequence | deployment_idSimple M models (absolute effects):
M1: butterfly_diff_95th_sqrt ~ 1
M2: butterfly_diff_95th_sqrt ~ wind_max_gust_t_1
M24: butterfly_diff_95th_sqrt ~ s(wind_max_gust_t_1)
Complex M models:
M34: butterfly_diff_95th_sqrt ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
M41: butterfly_diff_95th_sqrt ~ temp_at_max_count_t_1 * wind_max_gust_t_1
B models (proportional effects):
B1: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1
B19: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1
B31: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1)
B51: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1
Note: Full model fitting and results are documented in the analysis document.
This summary presents key findings for interpretation.
Total models tested: 105 (50 M models + 55 B models)
Each model tested with 2 correlation structures: no_corr and AR1
Total model fits: 210
The table below shows the 10 best-performing models. Note that all top models are B models (include baseline count), indicating that proportional/density-dependent effects provide substantially better fit than absolute effects.
Model naming convention: - First part (B19, B31, etc.): Model formula specification (see Appendix A) - Last part (AR1 or no_corr): Correlation structure used
| Model | Correlation | AIC | Delta_AIC | AIC_weight | df |
|---|---|---|---|---|---|
| B33_AR1 | AR1 | 668.401 | 0.000 | 0.148 | 9 |
| B29c_AR1 | AR1 | 668.671 | 0.270 | 0.129 | 8 |
| B28_AR1 | AR1 | 669.101 | 0.700 | 0.104 | 7 |
| B35_AR1 | AR1 | 669.573 | 1.172 | 0.082 | 13 |
| B37_AR1 | AR1 | 669.594 | 1.193 | 0.081 | 15 |
| B29_AR1 | AR1 | 669.685 | 1.284 | 0.078 | 6 |
| B34_AR1 | AR1 | 670.016 | 1.615 | 0.066 | 11 |
| B29a_AR1 | AR1 | 670.504 | 2.103 | 0.052 | 7 |
| B38_AR1 | AR1 | 670.691 | 2.289 | 0.047 | 10 |
| B29d_AR1 | AR1 | 670.864 | 2.463 | 0.043 | 8 |
Key findings: - All top 10 models are B models (include baseline butterfly count) - All top 10 models use AR1 correlation structure - ΔAIC between best M model and best B model: >50 (see full table in Appendix B)
Multiple models show strong empirical support, suggesting uncertainty in the precise form of environmental effects:
Number of models with ΔAIC < 2: 7
| Model | Correlation | Formula | AIC | Delta_AIC | AIC_weight | df |
|---|---|---|---|---|---|---|
| B33_AR1 | AR1 | butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) | 668.4011 | 0.0000 | 0.1477 | 9 |
| B29c_AR1 | AR1 | butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) | 668.6711 | 0.2700 | 0.1291 | 8 |
| B28_AR1 | AR1 | butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1) | 669.1010 | 0.6999 | 0.1041 | 7 |
| B35_AR1 | AR1 | butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) | 669.5730 | 1.1719 | 0.0822 | 13 |
| B37_AR1 | AR1 | butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) | 669.5941 | 1.1930 | 0.0814 | 15 |
| B29_AR1 | AR1 | butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) | 669.6853 | 1.2842 | 0.0777 | 6 |
| B34_AR1 | AR1 | butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) | 670.0163 | 1.6152 | 0.0659 | 11 |
Given the model selection uncertainty (multiple models with ΔAIC < 2), it’s important to examine the top-ranked models in detail to understand which patterns are robust across competing formulations and which depend on model specification.
| Rank | Model | Formula | Correlation | AIC | Delta_AIC | AIC_Weight | R_squared | Deviance_Explained |
|---|---|---|---|---|---|---|---|---|
| 1 | B33_AR1 | butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) | AR1 | 668.401 | 0.00 | 0.1477 | 0.2264 | NA |
| 2 | B29c_AR1 | butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) | AR1 | 668.671 | 0.27 | 0.1291 | 0.1753 | NA |
| 3 | B28_AR1 | butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1) | AR1 | 669.101 | 0.70 | 0.1041 | 0.1679 | NA |
Key observations:
Formula: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
Family: gaussian
Link function: identity
Formula:
butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 +
s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.444416 1.263453 2.726 0.00766 **
butterflies_95th_percentile_t_1 -0.037703 0.006972 -5.408 4.95e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(wind_max_gust_t_1) 2.466 2.466 2.725 0.08649 .
s(sum_butterflies_direct_sun_t_1) 2.918 2.918 6.122 0.00245 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.226
Scale est. = 43.072 n = 100
Figure. Partial response curves for Model 1. Each panel shows the smooth effect of a predictor on the response variable (change in 95th percentile butterfly count), holding other variables constant.
Figure. Diagnostic plots for Model 1 showing residual patterns, normality assessment, variance homogeneity, and residual distribution.
Formula: butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1)
ΔAIC: 0.27
Family: gaussian
Link function: identity
Formula:
butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) +
s(wind_max_gust_t_1)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.078 1.068 -1.009 0.315
Approximate significance of smooth terms:
edf Ref.df F p-value
s(butterflies_95th_percentile_t_1) 1.153 1.153 24.194 1.55e-06 ***
s(wind_max_gust_t_1) 2.491 2.491 2.877 0.0649 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.175
Scale est. = 44.916 n = 100
Figure. Partial response curves for Model 2.
Figure. Diagnostic plots for Model 2.
Formula: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1)
ΔAIC: 0.7
Family: gaussian
Link function: identity
Formula:
butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 +
s(sum_butterflies_direct_sun_t_1)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.560773 1.315741 2.706 0.00807 **
butterflies_95th_percentile_t_1 -0.038876 0.007134 -5.449 3.97e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(sum_butterflies_direct_sun_t_1) 2.886 2.886 6.284 0.00297 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.168
Scale est. = 46.265 n = 100
Figure. Partial response curves for Model 3.
Figure. Diagnostic plots for Model 3.
The wind effects show unexpected patterns in these results. The direct sunlight relationship is also interesting, with an initial dip followed by increases in cluster size with more light exposure. I’m uncertain about the interpretation of these results and whether they should be included in our conclusions. I’d appreciate your thoughts on these patterns.
Models without previous day butterfly count, testing absolute environmental effects:
Null and single predictor models: - M1: ~ 1 (null model) - M2: ~ wind_max_gust_t_1 - M3: ~ temp_max_t_1 - M4: ~ temp_min_t_1 - M5: ~ temp_at_max_count_t_1 - M6: ~ sum_butterflies_direct_sun_t_1
Temperature combinations (linear): - M8: ~ temp_max_t_1 + temp_min_t_1 - M9: ~ temp_max_t_1 + temp_at_max_count_t_1 - M10: ~ temp_min_t_1 + temp_at_max_count_t_1 - M11: ~ temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1
Two-variable combinations: - M12: ~ wind_max_gust_t_1 + temp_max_t_1 - M13: ~ wind_max_gust_t_1 + temp_min_t_1 - M14: ~ wind_max_gust_t_1 + temp_at_max_count_t_1 - M15: ~ wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M16: ~ temp_at_max_count_t_1 + sum_butterflies_direct_sun_t_1
Full models with various temperature specs (linear): - M17: ~ temp_max_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M18: ~ temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M19: ~ temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M20: ~ temp_max_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M21: ~ temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1
Single smooth terms: - M24: ~ s(wind_max_gust_t_1) - M25: ~ s(temp_max_t_1) - M26: ~ s(temp_min_t_1) - M27: ~ s(temp_at_max_count_t_1) - M28: ~ s(sum_butterflies_direct_sun_t_1)
Smooth term combinations: - M30: ~ s(temp_max_t_1) + s(temp_min_t_1) - M31: ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) - M32: ~ s(temp_at_max_count_t_1) + s(sum_butterflies_direct_sun_t_1) - M33: ~ s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
Complex smooth models: - M34: ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M35: ~ s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M37: ~ s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
Mixed linear and smooth: - M38: ~ temp_at_max_count_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M39: ~ s(temp_at_max_count_t_1) + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M40: ~ s(temp_at_max_count_t_1) + wind_max_gust_t_1 + s(sum_butterflies_direct_sun_t_1)
Interactions: - M41: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 - M42: ~ temp_at_max_count_t_1 * sum_butterflies_direct_sun_t_1 - M43: ~ wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - M44: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M45: ~ temp_at_max_count_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - M46: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1
Temperature range models: - M47: ~ I(temp_max_t_1 - temp_min_t_1) - M48: ~ I(temp_max_t_1 - temp_min_t_1) + wind_max_gust_t_1 - M49: ~ s(I(temp_max_t_1 - temp_min_t_1)) - M50: ~ s(I(temp_max_t_1 - temp_min_t_1)) + s(wind_max_gust_t_1)
All models include butterflies_95th_percentile_t_1 to test density-dependent effects:
Baseline-only: - B1: ~ butterflies_95th_percentile_t_1
Single predictor + baseline (linear): - B2: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 - B3: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 - B4: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 - B5: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 - B6: ~ butterflies_95th_percentile_t_1 + sum_butterflies_direct_sun_t_1
Temperature combinations + baseline: - B8: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 - B9: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_at_max_count_t_1 - B10: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 + temp_at_max_count_t_1 - B11: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1
Two-variable combinations + baseline: - B12: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_max_t_1 - B13: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_min_t_1 - B14: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_at_max_count_t_1 - B15: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B16: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + sum_butterflies_direct_sun_t_1
Full models + baseline: - B17: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B18: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B19: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B20: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B21: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1
Single smooth + baseline: - B24: ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) - B25: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) - B26: ~ butterflies_95th_percentile_t_1 + s(temp_min_t_1) - B27: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) - B28: ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1)
Smooth baseline variations: - B29: ~ s(butterflies_95th_percentile_t_1) - B29a: ~ s(butterflies_95th_percentile_t_1) + wind_max_gust_t_1 - B29b: ~ s(butterflies_95th_percentile_t_1) + temp_at_max_count_t_1 - B29c: ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) - B29d: ~ s(butterflies_95th_percentile_t_1) + s(temp_at_max_count_t_1)
Smooth combinations + baseline: - B30: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) - B31: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) - B32: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(sum_butterflies_direct_sun_t_1) - B33: ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
Complex smooth + baseline: - B34: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B35: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B37: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
Mixed linear and smooth + baseline: - B38: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B39: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B40: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + wind_max_gust_t_1 + s(sum_butterflies_direct_sun_t_1)
Interactions + baseline: - B41: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 - B42: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * sum_butterflies_direct_sun_t_1 - B43: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - B44: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B45: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - B46: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1
Temperature range + baseline: - B47: ~ butterflies_95th_percentile_t_1 + I(temp_max_t_1 - temp_min_t_1) - B48: ~ butterflies_95th_percentile_t_1 + I(temp_max_t_1 - temp_min_t_1) + wind_max_gust_t_1 - B49: ~ butterflies_95th_percentile_t_1 + s(I(temp_max_t_1 - temp_min_t_1)) - B50: ~ butterflies_95th_percentile_t_1 + s(I(temp_max_t_1 - temp_min_t_1)) + s(wind_max_gust_t_1)
Interactions with baseline (density-dependent environmental effects): - B51: ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1 - B52: ~ butterflies_95th_percentile_t_1 * temp_at_max_count_t_1 - B53: ~ butterflies_95th_percentile_t_1 * sum_butterflies_direct_sun_t_1 - B54: ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1 + temp_at_max_count_t_1 - B55: ~ butterflies_95th_percentile_t_1 * temp_at_max_count_t_1 + wind_max_gust_t_1
Note: All models include random intercept for deployment_id and were tested with both no correlation and AR1 correlation structures.
Summary statistics:
Total models fitted: 200
Models with ΔAIC < 2: 7
Models with ΔAIC < 4: 13
Models with ΔAIC < 10: 32
Best M model (no baseline):
Model: M34_AR1
ΔAIC: 14.07
Best B model (with baseline):
Model: B33_AR1
ΔAIC: 0
Document prepared: 2025-09-30 Analysis code: analysis/monarch_daily_gam_analysis.qmd Data source: data/monarch_daily_lag_analysis.csv Data preparation: data_prep_daily_lag.py